Evaluating the Scalability of Data Mining Provider Classifiers

Authors

  • C. L. Curotto
  • N. F. F. Ebecken
Abstract

Two classifiers implemented as Data Mining Providers are considered. These providers run either as stand-alone servers or aggregated with Microsoft® SQL Server. One of the classifiers is the Microsoft® Decision Trees algorithm. The other is the Simple Naive Bayes incremental classifier, which supports continuous input attributes, multiple discrete predictable attributes, and incremental updating of the training data set. The performance study carried out to verify the scalability of the classifiers varies four factors: cardinality (number of training cases), number of input attributes, number of states of the input attributes, and number of predictable attributes.
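
As a rough illustration of what "incremental updating" with continuous input attributes can look like, the sketch below keeps a running mean and variance per class and attribute, so each new training case updates the model without revisiting earlier cases. It is not the authors' provider implementation: the class name `IncrementalGaussianNB`, the Gaussian likelihood, the variance floor, and the single predictable attribute are all assumptions made for this example.

```python
import math
from collections import defaultdict


class IncrementalGaussianNB:
    """Hypothetical sketch of a naive Bayes classifier trained one case at a time.

    Assumes continuous inputs modelled as per-class Gaussians and a single
    discrete predictable attribute (the label).
    """

    def __init__(self):
        self.class_counts = defaultdict(int)
        # stats[label][attribute_index] = (count, mean, sum of squared deviations)
        self.stats = defaultdict(dict)

    def update(self, x, label):
        """Fold one training case into the model (Welford's running update)."""
        self.class_counts[label] += 1
        per_attr = self.stats[label]
        for i, value in enumerate(x):
            n, mean, m2 = per_attr.get(i, (0, 0.0, 0.0))
            n += 1
            delta = value - mean
            mean += delta / n
            m2 += delta * (value - mean)
            per_attr[i] = (n, mean, m2)

    def predict(self, x):
        """Return the label with the highest log posterior for case x."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for i, value in enumerate(x):
                n, mean, m2 = self.stats[label][i]
                var = max(m2 / n, 1e-9)  # assumed floor to avoid zero variance
                score += -0.5 * math.log(2 * math.pi * var)
                score -= (value - mean) ** 2 / (2 * var)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = IncrementalGaussianNB()
for x, label in [([0.0], "a"), ([1.0], "a"), ([10.0], "b"), ([11.0], "b")]:
    model.update(x, label)
print(model.predict([0.5]), model.predict([10.5]))
```

Because each `update` call touches only one counter per attribute, training cost in this sketch grows linearly with both the number of cases and the number of input attributes, which are two of the factors the study varies. Supporting several predictable attributes could be done by keeping one such model per predictable attribute, though that is again an assumption rather than something the abstract states.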

Similar resources

Pruning Meta-Classifiers in a Distributed Data Mining System

JAM is a powerful and portable agent-based distributed data mining system that employs meta-learning techniques to integrate a number of independent classifiers (models) derived in parallel from independent and (possibly) inherently distributed databases. Although meta-learning promotes scalability and accuracy in a simple and straightforward manner, brute force meta-learning techniques can resul...

Pruning Meta-Classifiers in a Distributed Data Mining System CUCS-032-97

JAM is a powerful and portable agent-based distributed data mining system that employs meta-learning techniques to integrate a number of independent classifiers (models) derived in parallel from independent and (possibly) inherently distributed databases. Although meta-learning promotes scalability and accuracy in a simple and straightforward manner, brute force meta-learning techniques can resu...

Pruning Classifiers in a Distributed Meta-Learning System

JAM is a powerful and portable agent-based distributed data mining system that employs meta-learning techniques to integrate a number of independent classifiers (concepts) derived in parallel from independent and (possibly) inherently distributed databases. Although meta-learning promotes scalability and accuracy in a simple and straightforward manner, brute force meta-learning techniques can re...

Efficient Data Mining with Evolutionary Algorithms for Cloud Computing Application

With the rapid development of the internet, the amount of information and data being produced is massive. Clients are therefore confused by the huge amount of data and find it difficult to tell which of it is useful. Data mining can overcome this problem. When data mining is run on cloud computing, it reduces processing time, energy usage and costs. As the speed of ...

Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining

This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying its involved biomarkers. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profile. This dataset suffers from the problem of imbalanced classes...

Publication date: 2003